Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information 
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found 
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in 
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web 
server (http://ipr.etsi.org).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee 
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web 
server) which are, or may be, or may become, essential to the present document. 



Foreword 

This Technical Specification (TS) has been produced by ETSI Technical Committee Speech and multimedia 
Transmission Quality (STQ). 



Introduction 

Speech terminals currently implement narrowband and wideband audio bandwidths. Terminal equipment may offer a
wider bandwidth by using features already available in these terminals. Such equipment may implement conversational
features that benefit from the electroacoustic equipment already available in the terminal and may provide higher
quality for the end users.

The present document is intended to provide initial requirements and test methods for this type of equipment.

1 Scope



The present document provides speech and audio transmission performance requirements and measurement methods for
headset functions of superwideband/fullband terminals. The present document provides requirements in order to
optimize the end-to-end quality perceived by users.

Users of ICT/terminal equipment are becoming more sensitive to voice and music quality (for music used in
conversational services) and therefore demand further enhancement, especially further extension of the coded audio
bandwidth.

For instance, this is the case for high quality conferencing services with music on hold, better background environment
rendering and longer duration than normal point-to-point calls.

Standardized superwideband and fullband coders are now available, some being also compatible with wideband coders. 

The present document considers only conversational services (that may be mixed with other services) and does not
cover streaming-only services.

Such applications include:

• Speech and audio communication including conferencing.

• Bandwidth extension which may allow usage for some mixed content.

• Superwideband enhancement coupled with stereo/dichotic.

The send path can be characterized in two ways:

• The signal picked up by the microphone may combine speech, music and every type of environmental signal.

• Direct insertion of any type of signal.

For the receive path, the signal may combine two types:

• Communication signals such as described for the send path.

• Signals coming from distributed applications (e.g. advertisement, music on hold, etc.).

Handset terminals are not within the scope of the present document.



2 References

References are either specific (identified by date of publication and/or edition number or version number) or
non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the
referenced document (including any amendments) applies.

Referenced documents which are not found to be publicly available in the expected location might be found at
http://docbox.etsi.org/Reference.

NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.



2.1 Normative references

The following referenced documents are necessary for the application of the present document.

[1] Recommendation ITU-T P.501 Amendment 1: "Test signals for use in telephonometry".

[2] Recommendation ITU-T P.10/G.100: "Vocabulary for performance and quality of service".

[3] Recommendation ITU-T P.58: "Head and torso simulator for telephonometry".

[4] Recommendation ITU-T P.581: "Use of head and torso simulator (HATS) for hands-free and
handset terminal testing".

[5] Recommendation ITU-T P.79: "Calculation of loudness ratings for telephone sets".

[6] Recommendation ITU-T G.711.1 (annex D): "Wideband embedded extension for G.711 pulse
code modulation".

[7] Recommendation ITU-T G.722.1 (annex C): "Low-complexity coding at 24 and 32 kbit/s for
hands-free operation in systems with low frame loss".

[8] Recommendation ITU-T G.729.1 (annex E): "G.729-based embedded variable bit-rate coder: An
8-32 kbit/s scalable wideband coder bitstream interoperable with G.729".

[9] Recommendation ITU-T G.718 (annex B): "Frame error robust narrow-band and wideband
embedded variable bit-rate coding of speech and audio from 8-32 kbit/s".

[10] Recommendation ITU-T G.719: "Low-complexity, full-band audio coding for high-quality,
conversational applications".

[11] ETSI ES 202 396-1: "Speech and multimedia Transmission Quality (STQ); Speech quality
performance in the presence of background noise; Part 1: Background noise simulation technique
and background noise database".

[12] ETSI ES 202 739: "Speech and multimedia Transmission Quality (STQ); Transmission
requirements for wideband VoIP terminals (handset and headset) from a QoS perspective as
perceived by the user".

[13] ETSI TS 103 739: "Speech and multimedia Transmission Quality (STQ); Transmission
requirements for wideband wireless terminals (handset and headset) from a QoS perspective as
perceived by the user".

[14] Recommendation ITU-T P.863: "Perceptual objective listening quality assessment".

[15] Recommendation ITU-T P.380: "Electro-acoustic measurements on headsets".

[16] IEC 61260: "Electroacoustics - Octave-band and fractional-octave-band filters".

[17] Recommendation ITU-T P.800: "Methods for subjective determination of transmission quality".

[18] Recommendation ITU-T P.830: "Subjective performance assessment of telephone-band and
wideband digital codecs".

[19] Recommendation ITU-T G.722: "7 kHz audio-coding within 64 kbit/s".

[20] ISO 3:1973: "Preferred numbers - Series of preferred numbers".

[21] Recommendation ITU-T G.711.1 (annex F): "Wideband embedded extension for G.711 pulse code
modulation".

[22] Recommendation ITU-T P.57: "Artificial ears".

[23] Recommendation ITU-T P.64: "Determination of sensitivity/frequency characteristics of local
telephone systems".

[24] ISO 3745: "Acoustics - Determination of sound power levels and sound energy levels of noise
sources using sound pressure - Precision methods for anechoic rooms and hemi-anechoic rooms".

2.2 Informative references 

The following referenced documents are not necessary for the application of the present document but they assist the 
user with regard to a particular subject area. 

[i.1] ISO 532: "Acoustics - Method for calculating loudness level".

3 Definitions and abbreviations

3.1 Definitions 

For the purposes of the present document, the following terms and definitions apply: 

binaural listening (definition found on Internet): both ears are involved for the perception of sound 

dichotic (definition found on Internet): relating to or involving the presentation of a stimulus to one ear that differs in 
some respect (as pitch, loudness, frequency, or energy) from a stimulus presented to the other ear 

diotic (definition found on Internet): pertaining to or affecting both ears (same signal in both ears) 

dual channel mode: audio mode, in which two audio channels with independent programme contents (e.g. bilingual) 
are encoded within one audio bit stream 

fullband bandwidth: 20 Hz - 20 kHz 

stereo mode: audio mode in which two channels forming a stereo pair (left and right) are encoded within one bit stream 
and for which the coding process is the same as for the Dual channel mode 

superwideband: covers at least mono and stereo capabilities 

superwideband bandwidth: transmission of speech with a nominal pass-band wider than 100 - 7 000 Hz, usually 
understood to be 50 - 14 000 Hz (definition from Recommendation ITU-T P.10/G.100 [2])

3.2 Abbreviations 

For the purposes of the present document, the following abbreviations apply: 



ACR     Absolute Category Rating
DRP     ear Drum Reference Point
ERP     Ear Reference Point
EVS     Enhanced Voice Services
FB      Fullband
GAT     Group Audio Terminal
HATS    Head and Torso Simulator
MCU     Multiplexing Control Unit
MRP     Mouth Reference Point
MS      Mid-sized Stereo
POI     Point of Interconnection
SLR     Send Loudness Rating
SWB     Superwideband
TCL     Terminal Echo Loss



4 Applications and coder considerations 
4.1 Applications 

The following applications are within the scope of the present document: 

• Speech and audio communication including conferencing using high quality hands-free systems, for which
superwideband/fullband coding can better reproduce the audio environment and provide improved quality and
audio immersion. These applications also cover GATs (Group Audio Terminals) and teleconference systems
such as "Telepresence".

• Bandwidth extension which may allow usage for some mixed content applications where wider bandwidth
could bring a significant added value for the customer (support of 14 kHz and 20 kHz bandwidth and
stereo/multichannel capability).

• Superwideband enhancement coupled with stereo/multichannel to maximize the quality enhancement for the
customer when the terminal device can support this capability.

The send path can be characterized in two ways: 

• The signal picked up by microphone(s) may combine speech, music and every type of environmental signal. 

NOTE: For some applications (e.g. journalist reporting) the user should have the possibility to cancel the noise 
environment or to transmit it without degradation. 

• Direct insertion of any type of signal. 

For the receive path, the signal may combine the following two types:

• Communication signals such as described for the send path.

• Signals coming from distributed applications (e.g. advertisement, music on hold, etc.).

4.2 Coder considerations 

As indicated in the scope, only coders supporting conversational SWB and FB services are applicable to the present
document.

4.2.1 Superwideband (SWB) 



Table 4.1: Use cases for coders

Coder reference                            | Speech | Other signals        | Stereo      | Remark
-------------------------------------------+--------+----------------------+-------------+-------------------
Recommendation ITU-T G.722.1 [7]           | X      | X Music              |             | For low frame loss
annex C                                    |        |                      |             |
Recommendation ITU-T G.729.1 [8]           | X      | X background noise   |             |
annex E (extension SWB)                    |        | (X) Music            |             |
Recommendation ITU-T G.718 [9]             | X      | X Music              |             |
annex B                                    |        |                      |             |
Recommendation ITU-T G.711.1               | X      | X                    | X (annex F) |
annexes D [6] and F [21]                   |        |                      |             |
Recommendation ITU-T G.722 [19]            | X      | X                    | X (annex D) |
annexes B and D                            |        |                      |             |

NOTE: Recommendation ITU-T G.722.1 [7] is intended to be used for hands-free applications. It is referenced here
considering that a terminal using this coder may implement a headset function.



When X is in brackets, it means that the coder is not optimized for this application. 

The following coders are recommended for Superwideband: 

• Recommendation ITU-T G.722.1 [7] Low-complexity coding at 24 and 32 kbit/s for hands-free operation in
systems with low frame loss. Annex C 14 kHz mode at 24, 32 and 48 kbit/s.

The algorithm is recommended for use in hands-free applications such as conferencing where there is a low
probability of frame loss. It may be used with speech or music inputs. The bit rate may be changed at any
20-ms frame boundary. New annex C contains the description of a low-complexity extension mode to G.722.1,
which doubles the algorithm to permit 14-kHz audio bandwidth using a 32-kHz audio sample rate, at 24, 32,
and 48 kbit/s.

Annex C: this annex provides a description of the 14-kHz mode at 24, 32 and 48 kbit/s for this 
Recommendation. 

• Recommendation ITU-T G.729.1 [8] annex E (extension SWB for G.729.1 [8]).

This annex provides the high-level description of the higher bit-rate extension of G.729 designed to
accommodate a wide range of input signals, such as speech with background noise and even music.

• Recommendation ITU-T G.718 [9] annex B Superwideband scalable (extension for 
Recommendation ITU-T G.718 [9]). This annex describes a scalable superwideband (SWB, 50-14 000 Hz) 
speech and audio coding algorithm operating from 36 to 48 kbit/s and interoperable with 
Recommendation ITU-T G.718 [9]. 

• Recommendation ITU-T G.711.1 [6] annex D defines the superwideband extension and
annex F defines the stereo embedded extension for Recommendation ITU-T G.711.1 [6].
"Annex F is intended as a stereo extension to the G.711.1 [6] wideband coding algorithm and its 
superwideband annex D. Compared to discrete two-channel (dual-mono) audio transmission, this stereo 
extension G.711.1 [6] annex F saves valuable bandwidth for stereo transmission. It is specified to offer the 
stereo capability while providing backward compatibility with the monaural core in an embedded scalable 
way. The annex provides very good quality for stereo speech contents (clean speech and noisy speech with 
various stereo sound pickup systems: binaural, MS, etc.), and for most of the conditions it provides 
significantly higher quality than low bitrate dual-mono. For some music contents, e.g. highly reverberated 
and/or with diffuse sound, the algorithm may have some performance limitations and may not perform as good 
as dual-mono codecs, however it achieves the quality of state-of-the-art parametric stereo codecs". 

• Recommendation ITU-T G.722 [19] annex B defines the superwideband extension
and annex D defines the stereo embedded extension for Recommendation ITU-T G.722 [19].
"Annex B describes a scalable superwideband (SWB, 50-14 000 Hz) speech and audio coding algorithm 
operating at 64, 80 and 96 kbit/s. The Recommendation ITU-T G.722 [19] superwideband extension codec is 
interoperable with Recommendation ITU-T G.722 [19]. The output of the Recommendation ITU-T G.722 [19] 
SWB coder has a bandwidth of 50-14000 Hz". 

"Annex D describes a stereo extension of the wideband codec G.722 and its superwideband extension, G.722 
annex B. It is optimized for the transmission of stereo signals with limited additional bitrate, while keeping full 
compatibility with both codecs. Annex D operates from 64 to 128 kbit/s with four superwideband stereo 
bitrates at 80, 96, 112 and 128 kbit/s and two wideband stereo bitrates at 64 and 80 kbit/s". 

NOTE: The potential future mobile coder EVS (Enhanced Voice Services) should also be considered when
available. It will be relevant to reconsider the contents of the present document to take into account the
implications of the EVS coder implementation in terminals within the scope of the present document.
EVS is designed for packet-switched networks/mobile VoIP, and VoLTE is a key target application.
The key features of EVS are superwideband speech (32 kHz sampling) with improved speech quality and
improved music performance.

A future version of the present document will take into account this coder when available. 

4.2.2 Fullband (FB) 

The following coder is recommended for fullband: 

• Recommendation ITU-T G.719 [10] Low-complexity, fullband audio coding for high-quality, 
conversational applications. 

"Recommendation ITU-T G.719 [10] describes the G.719 [10] coding algorithm for low-complexity fullband 
conversational speech and audio, operating from 32 kbit/s up to 128 kbit/s". 

The encoder input and decoder output are sampled at 48 kHz. The codec enables full bandwidth, from 20 Hz to 20 kHz,
encoding of speech, music and general audio content. The codec operates on 20-ms frames and has an algorithmic delay
of 40 ms.

NOTE: Amendment 1 adds new annex A, which specifies the use of the ISO base media file format as container for
the G.719 [10] bitstream and addresses non-conversational use cases of the codec (e.g. call waiting music
playback and recording of teleconferencing sessions, voice mail messages, online "jam" sessions).
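
As an illustration of the coder parameters quoted in this clause, the following Python sketch derives frame sizes from the sampling rate, frame length and bit rate. The dictionary layout and the names CODERS, samples_per_frame and bits_per_frame are illustrative assumptions and are not part of any ITU-T Recommendation; only the numerical values are taken from the text above.

```python
# Parameters quoted in clause 4.2; the data structure itself is an assumption.
CODERS = {
    "G.722.1 annex C": {
        "sample_rate_hz": 32_000,
        "frame_ms": 20,
        "bitrates_kbps": (24, 32, 48),
    },
    "G.719": {
        "sample_rate_hz": 48_000,
        "frame_ms": 20,
        "bitrates_kbps": (32, 128),  # lowest and highest rate only
    },
}


def samples_per_frame(coder: str) -> int:
    """Number of audio samples contained in one coder frame."""
    params = CODERS[coder]
    return params["sample_rate_hz"] * params["frame_ms"] // 1000


def bits_per_frame(coder: str, bitrate_kbps: int) -> int:
    """Number of coded bits per frame (kbit/s multiplied by ms gives bits)."""
    return bitrate_kbps * CODERS[coder]["frame_ms"]


if __name__ == "__main__":
    for name in CODERS:
        print(name, samples_per_frame(name), "samples per frame")
```

For example, a 20-ms frame at 48 kHz holds 960 samples, and at 32 kbit/s it carries 640 coded bits.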

5 Test considerations

The terminals within the scope of the present document are not only dedicated to speech communication but also
mix speech and audio contents and may implement stereo and multichannel transmissions. As a consequence there is
a need to define new parameters, such as:

• Loudness: Loudness Rating is determined only for speech or speech-like signals. Loudness may be calculated
over any type of signal (audio sequences, speech sequences and mixes of these sequences). Moreover, it is not
intended to define Loudness Rating algorithms for superwideband and fullband speech. To be consistent with
transmission planning, the loudness rating shall be determined using the wideband calculation and loudness shall
be measured for all the bandwidths. Clause 5.4.1.2 details the measurement principles.

• Binaural listening: Most of the test assessment methods and requirements for speech terminals are based
on monaural listening. Even if some of them (e.g. for hands-free Loudness Rating) are intended to take
binaural listening into account, the basic methods and requirements only do so through correction factors.
The plan is to adapt test methods to effective binaural listening.

As a consequence, the present document takes into account test arrangements that are defined for speech terminals or 
for audio equipment. 

Recommendation ITU-T P.58 [3] gives information about the use of HATS only from 100 Hz to 10 kHz, but new designs
offer wider bandwidths.

For send, the HATS can be used between 50 Hz and 16 kHz. Until the development of new systems with larger
bandwidth, send measurements will be limited to those frequencies.

NOTE 1: With some measurement equipment the use of such a bandwidth is not possible and should be limited to
100 Hz to 14 kHz.

For receive, a correction factor (given in annex B) allows measurement at DRP up to 16 kHz.

NOTE 2: It is not the intention of the present document to define new requirements to adapt HATS for
superwideband and fullband. However, when terminals implementing superwideband or fullband also
support wideband and/or narrowband speech, it is intended to use as far as possible test
methods defined for wideband terminals and consequently to use HATS for parameters measured in
wideband bandwidth.

5.1 Test Setup 
5.1.1 Setup for terminals 

The preferred acoustical access to terminals is the most realistic simulation of the "average" subscriber. This can be
achieved by using a Head And Torso Simulator (HATS) with appropriate ear simulation and appropriate means to fix
handset and headset terminals in a realistic and reproducible way to the HATS. HATS is described in
Recommendation ITU-T P.58 [3], appropriate ears are described in Recommendation ITU-T P.57 [22] (type 3.3 and
type 3.4 ear), and a proper positioning of handsets under realistic conditions is to be found in
Recommendation ITU-T P.64 [23].

The preferred way of testing a terminal is to connect it to a network simulator with exactly defined settings and access
points. The test sequences are fed in either electrically, using a reference codec or using the direct signal processing
approach, or acoustically using ITU-T specified devices.

Figure 5.1: Half channel terminal measurement



5.1.2 Setup for headsets

The artificial ear shall conform to Recommendation ITU-T P.57 [22]; type 3.3 or type 3.4 ears shall be used.

Recommendations for positioning headsets are given in Recommendation ITU-T P.380 [15]. If not stated otherwise,
headsets shall be placed in their recommended wearing position. Further information about setup and the use of HATS
can be found in Recommendation ITU-T P.380 [15].

Unless stated otherwise, if a volume control is provided the setting is chosen such that the nominal RLR is met as closely
as possible.



5.1.3 Position and calibration of HATS

All the send and receive characteristics shall be tested with the HATS, and it shall be indicated what type of ear was used.
The horizontal positioning of the HATS reference plane shall be guaranteed within ±2°.

The HATS shall be equipped with two type 3.3 or type 3.4 artificial ears as specified in
Recommendation ITU-T P.57 [22]. For binaural headsets two artificial ears are required. The
artificial ear shall be positioned on HATS according to Recommendation ITU-T P.58 [3].

The exact calibration and equalization can be found in Recommendation ITU-T P.581 [4].

For calibration of the mouth, equalization has to be limited to between 50 Hz and 16 kHz.

NOTE: With some measurement equipment the use of such a bandwidth is not possible and shall be limited to
100 Hz - 14 kHz.

For receive, if not stated otherwise, the HATS shall be corrected using the correction factor given in annex A.

5.1.4 Test signal 

The test signals are defined according to Recommendation ITU-T P.501 [1].

As the bandwidth of the speech signals defined in Recommendation ITU-T P.501 [1] is fullband, these test signals shall
be used in the present document:

• The test signal to be used for measurements such as frequency response and Loudness Rating shall be the
British-English single talk sequence described in clause 7.3.2 of Recommendation ITU-T P.501 [1].

• The female speaker signal of the short conditioning sequence described in clause 7.3.7 of
Recommendation ITU-T P.501 [1] shall be used as activation signal for measurements such as distortion and send
noise.

• The compressed real speech signal described in clause 7.3.3 of
Recommendation ITU-T P.501 Amendment 1 [1] shall be used for measurements such as TCLw.

For double-talk performance:

• A "double-talk" sequence representing typical double talk scenarios in real conversations is shown in
figure 63.4. This uses the single-talk sequence described in section 7.3.1 of Recommendation ITU-T P.501 [1],
shown in the lower pane, as the main speech and an additional competing speaker sequence, shown in the
upper pane.

5.1.5 Test signal levels 

The level dependency should be considered and consequently tests should also be done with signal levels lower and 
higher than the reference level defined in clauses 5.1.5.1 and 5.1.5.2. 

5.1.5.1 Send 

Unless specified otherwise, the applied test signal level shall be -4,7 dBPa. 

5.1.5.2 Receive 

Unless specified otherwise, the applied test signal level at the digital input shall be -16 dBm0.
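
The send level of clause 5.1.5.1 is given in dBPa (dB relative to 1 Pa). As a worked illustration, the short Python sketch below converts that level to sound pressure in pascals and to dB SPL (relative to 20 µPa). The function names are illustrative assumptions; the conversion itself only uses the standard definitions of these units.

```python
import math

P_REF_PA = 20e-6  # reference sound pressure for dB SPL (20 µPa)


def dbpa_to_pascal(level_dbpa: float) -> float:
    """Convert a level in dBPa (dB re 1 Pa) to RMS sound pressure in Pa."""
    return 10.0 ** (level_dbpa / 20.0)


def dbpa_to_dbspl(level_dbpa: float) -> float:
    """Convert a level in dBPa to dB SPL (dB re 20 µPa)."""
    return level_dbpa + 20.0 * math.log10(1.0 / P_REF_PA)


if __name__ == "__main__":
    send_level_dbpa = -4.7  # written -4,7 dBPa in the text
    print(round(dbpa_to_pascal(send_level_dbpa), 3))  # about 0.582 Pa
    print(round(dbpa_to_dbspl(send_level_dbpa), 1))   # about 89.3 dB SPL
```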

5.1.6 Setup of background noise simulation

A setup for simulating realistic background noises in a lab-type environment is described in ES 202 396-1 [11]. 
The signals attached to ES 202 396-1 [11] are fullband signals and should be used for background noise simulation. 

5.1.7 Acoustic environment 

NOTE: The acoustic environment may influence the results more significantly at low and high frequencies. It
should be adapted to the terminal bandwidth.

In general two possible approaches need to be taken into account: either room noise and background noise are an 
inherent part of the test environment or room noise and background noise shall be eliminated to such an extent that their 
influence on the test results can be neglected. 

Unless stated otherwise measurements shall be conducted under quiet and "anechoic" conditions. 

In cases where real or simulated background noise is used as part of the testing environment, the original background 
noise shall not be noticeably influenced by the acoustical properties of the room. 

In all cases where the performance of acoustic echo cancellers shall be tested, a realistic room, which represents the 
typical user environment for the terminal shall be used. 

5.1.8 Influence of terminal delay issue for measurements

As delay is introduced by the terminal, care shall be taken for all measurements using an activation signal. It shall be
checked that the test is performed on the test signal and not on the activation signal.
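
One possible way to keep the activation signal out of the analysis is to shift the analysis window by the duration of the activation signal plus the terminal delay. The Python sketch below illustrates this under stated assumptions: the recording is assumed to start together with the activation signal, the terminal delay is assumed to be known (for instance from a separate delay measurement), and the function name analysis_window is hypothetical.

```python
def analysis_window(activation_s: float, test_s: float,
                    delay_s: float, fs_hz: int) -> tuple[int, int]:
    """Return (start, stop) sample indices of the test signal in a recording.

    Assumption: the recording starts when playback of the activation signal
    starts, so the test signal appears after activation_s + delay_s seconds.
    """
    start = round((activation_s + delay_s) * fs_hz)
    stop = start + round(test_s * fs_hz)
    return start, stop


if __name__ == "__main__":
    # Hypothetical values: 2 s activation, 8 s test signal, 150 ms delay, 48 kHz
    print(analysis_window(2.0, 8.0, 0.15, 48_000))
```

Only the samples between start and stop would then be passed to the analysis.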

5.2 Environmental conditions for tests

The following conditions shall apply for the testing environment: 

a) Ambient temperature: 15 °C to 35 °C (inclusive); 

b) Relative humidity: 5 % to 85 %; 

c) Air pressure: 86 kPa to 106 kPa (860 mbar to 1 060 mbar). 

d) Unless specified otherwise, the background noise level shall be less than -64 dBPa(A) in conjunction with 
NC30 (ISO 3745 [24]). 

For specified tests, it is desirable to have a background noise level of less than -74 dBPa(A) in conjunction 
with NC20, but the background noise level of -64 dBPa(A) in conjunction with NC30 shall never be exceeded. 

Figure 5.2: NC-criteria for test environment
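
The limits a) to d) above lend themselves to a simple pre-test check. The following Python sketch is one possible form of such a check; the Environment class and the violations function are illustrative assumptions. It only covers the A-weighted level limit of item d) and does not evaluate the NC30 or NC20 spectral criteria.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    temperature_c: float
    humidity_pct: float
    pressure_kpa: float
    noise_dbpa_a: float  # A-weighted background noise level in dBPa(A)


def violations(env: Environment) -> list[str]:
    """Return the clause 5.2 conditions that are not met (empty list if all pass)."""
    problems = []
    if not 15.0 <= env.temperature_c <= 35.0:
        problems.append("temperature")
    if not 5.0 <= env.humidity_pct <= 85.0:
        problems.append("humidity")
    if not 86.0 <= env.pressure_kpa <= 106.0:
        problems.append("pressure")
    if not env.noise_dbpa_a < -64.0:
        problems.append("background noise")
    return problems


if __name__ == "__main__":
    print(violations(Environment(23.0, 50.0, 101.3, -70.0)))
```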

5.3 Accuracy of measurements and test signal generation

Unless specified otherwise, the accuracy of measurements made by test equipment shall be equal to or better than: 

Table 5.1: Measurement accuracy

Item                    | Accuracy
------------------------+-----------------------------
Electrical signal level | ±0,2 dB for levels > -50 dBV
                        | ±0,4 dB for levels < -50 dBV
Sound pressure          | ±0,7 dB
Frequency               | ±0,2 %
Time                    | ±0,2 %

Unless specified otherwise, the accuracy of the signals generated by the test equipment shall be better than:

Table 5.2: Accuracy of test signal generation

Quantity                     | Accuracy
-----------------------------+-------------------------------------------------
Sound pressure level         | +0/-6 dB for frequencies from 50 Hz to 100 Hz
                             | ±1 dB for frequencies from 100 Hz to 8 000 Hz
                             | ±3 dB for frequencies from 8 000 Hz to 16 000 Hz
Electrical excitation levels | ±0,4 dB across the whole frequency range
Frequency generation         | ±2 %
Time                         | ±0,2 %
Specified component values   | ±1 %

NOTE: This tolerance may be used to avoid measurements at critical frequencies, e.g. those
due to sampling operations within the terminal under test.



NOTE: With some measurement equipment the use of such a bandwidth is not possible and should be limited to
100 Hz to 14 kHz.
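
The frequency-dependent sound pressure level tolerance of table 5.2 can be expressed as a small lookup. The Python sketch below is a minimal illustration with hypothetical function names. It assumes that the first band runs from 50 Hz to 100 Hz, and, because the table bands share their edge frequencies, it assigns 100 Hz and 8 000 Hz to the ±1 dB band; that boundary choice is an assumption and is not stated in the table.

```python
def spl_tolerance_db(frequency_hz: float) -> tuple[float, float]:
    """Return (lower, upper) tolerance in dB for the generated sound pressure level."""
    if 50.0 <= frequency_hz < 100.0:
        return (-6.0, 0.0)
    if 100.0 <= frequency_hz <= 8000.0:
        return (-1.0, 1.0)
    if 8000.0 < frequency_hz <= 16000.0:
        return (-3.0, 3.0)
    raise ValueError("frequency outside the range covered by table 5.2")


def spl_within_tolerance(frequency_hz: float, target_db: float,
                         generated_db: float) -> bool:
    """Check a generated level against its target at one frequency."""
    lower, upper = spl_tolerance_db(frequency_hz)
    return lower <= generated_db - target_db <= upper


if __name__ == "__main__":
    print(spl_tolerance_db(1000.0))
    print(spl_within_tolerance(12000.0, 90.0, 94.0))
```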

For terminal equipment which is directly powered from the mains supply, all tests shall be carried out within ±5 % of 
the rated voltage of that supply. If the equipment is powered by other means and those means are not supplied as part of 
the apparatus, all tests shall be carried out within the power supply limit declared by the supplier. If the power supply is 
a.c. the test shall be conducted within ±4 % of the rated frequency. 

5.4 Specific test considerations 

Even if the present document is dedicated to conversational services, the signals that are transmitted may combine 
speech and audio. 

5.4.1 Loudness rating and Loudness 

5.4.1.1 Loudness Rating

Loudness Rating, as defined in Recommendation ITU-T P.79 [5], applies for narrowband and wideband and is specific
to telecommunications transmission systems. So, when terminals implement wideband speech or are intended to
communicate with wideband or narrowband terminals, the terminals shall be calibrated for SLR and RLR values.

Due to the current bandwidth limitation of the loudness rating calculation, it is not possible to calculate superwideband or
fullband loudness ratings.

NOTE: For RLR and SLR, values are similar or derived from those defined in ES 202 739 [12] and 
TS 103 739 [13]. 

5.4.1.2 Loudness 

Loudness quantifies the level perceived by the user and should be more relevant when the signal combines speech
and audio sequences. ISO 532 [i.1] method B defines a standardized way to determine the loudness of a steady-state
complex signal.

This assessment method takes into account the level and the spectrum of the signals as well as binaural
listening. Loudness may be calculated for any type of signal (speech, music, noise) and mixed signals.

Standardized audio and speech signals should be used (possibly based on a combination of sequences defined in
Recommendation ITU-T P.501 [1] and in ES 202 396-1 [11]).

When the terminal provides superwideband or fullband in addition to wideband or narrowband, the reference
loudness value (expressed in phons) shall be determined for narrowband or wideband transmission.

The loudness measured in superwideband or fullband should be equal to, and preferably higher than, the loudness value
measured for narrowband or wideband.

If the superwideband and fullband terminal does not support wideband transmission, standardized loudness levels have
to be defined (for further study).

5.4.2 Binaural listening 

The scope of the present document includes terminals that may have two earpieces and distant sound pick-up using two or
more microphones.

The terminal may also provide stereo listening or binaural rendering built from the MCU.
It is therefore relevant to consider binaural listening (for further study).



6 Requirements considerations and associated Measurement Methodologies

When possible, parameter requirements will be derived from requirements defined for the wideband terminals. The
recommended test method is also provided in the same clause as the requirements.

6.1 Send parameters 
6.1.1 Send Frequency response 

Requirement 

The send frequency response of the headset shall be within a mask as defined in table 6.1 for SWB and table 6.2 for FB, 
and shown in figure 6.1 for SWB and figure 6.2 for FB. This mask shall be applicable for all types of headsets. 

Table 6.1: Superwideband send frequency response limits

Frequency | Upper limit | Lower limit
----------+-------------+------------
100 Hz    | 5 dB        | -5 dB
12 500 Hz | 5 dB        | -5 dB
14 000 Hz | 5 dB        | -10 dB

NOTE: The limits for intermediate frequencies lie on a straight line drawn between the given values on a
linear (dB) - logarithmic (Hz) scale. The requirement is based on 1/12th octave measurement.

Figure 6.1: Send frequency response mask for superwideband



Table 6.2: Fullband send frequency response limits

Frequency | Upper limit | Lower limit
----------+-------------+------------
100 Hz    | 5 dB        | -5 dB
12 500 Hz | 5 dB        | -5 dB
14 000 Hz | 5 dB        | -5 dB

NOTE: The limits for intermediate frequencies lie on a straight line drawn between the given values on a
linear (dB) - logarithmic (Hz) scale. The requirement is based on 1/12th octave measurement.

Figure 6.2: Send frequency response mask for fullband

NOTE 1: The basis for the target frequency responses in sending and receiving is the orthotelephonic reference
response, which is measured between 2 subjects at 1 m distance under free field conditions and
assumes an ideal receive characteristic. Under these conditions the overall frequency response shows a
rising slope. In contrast to other standards, the present document no longer uses the ERP as the reference
point for receiving but the free field. With the concept of free-field based receive measurements, a rising
slope for the overall frequency response is achieved by a flat target frequency response in sending and a
free-field based receiving frequency response.

NOTE 2: A "balanced" frequency response is preferable from the perception point of view. If frequency
components in the low frequency domain are attenuated, frequency components in the
high frequency domain should be attenuated in a similar way.



Measurement Method 



The test signal is defined in clause 5.1.4. The spectrum of the acoustic signal produced by the artificial mouth is calibrated under free field conditions at the MRP. The test signal level shall be -4,7 dBPa.
The headset terminal is set up as described in clause 5.1.

The tests are repeated 5 times, in conformance with Recommendation ITU-T P.380 [15]. The results are averaged (averaged value in dB, for each frequency).

Measurements shall be made at one twelfth-octave intervals as given by the R.40 series of preferred numbers in ISO 3 [20] for frequencies from 100 Hz to 14 kHz inclusive. For the calculation, the averaged measured level at the electrical reference point for each frequency band is referred to the averaged test signal level measured in each frequency band at the MRP.

The sensitivity is expressed in terms of dBV/Pa.
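
As an illustration only, the averaging and referencing steps described above might be sketched as follows. The function names and the dictionary layout (band centre frequency mapped to a list of per-run levels) are assumptions made for this example, not part of the present document.

```python
def mean_db(levels_db):
    """Arithmetic mean of levels already expressed in dB (averaged value in dB)."""
    return sum(levels_db) / len(levels_db)


def send_sensitivity(electrical_runs_dbv, mrp_runs_dbpa):
    """Per-band send sensitivity in dBV/Pa.

    Both arguments map a band centre frequency in Hz to the list of levels
    measured in the repeated runs: dBV at the electrical reference point,
    dBPa at the MRP.
    """
    return {
        freq: mean_db(electrical_runs_dbv[freq]) - mean_db(mrp_runs_dbpa[freq])
        for freq in electrical_runs_dbv
    }


# Hypothetical figures for a single band and 5 repetitions.
electrical = {1000.0: [-30.1, -29.9, -30.0, -30.2, -29.8]}
acoustic = {1000.0: [-4.7] * 5}
print(send_sensitivity(electrical, acoustic))
```

Subtracting the two dB levels is the same as taking the ratio of voltage to sound pressure, which is why the result carries the unit dBV/Pa.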



6.1.2 Send Loudness Rating

Requirement: 

The nominal value of Send Loudness Rating (SLR) shall be: 
• SLR(set) = 8 dB ± 3 dB

Measurement Method:

The test signal to be used for the measurements shall be the British-English single talk sequence described in clause 7.3.2 of Recommendation ITU-T P.501 Amendment 1 [1]. The spectrum of the acoustic signal produced by the artificial mouth is calibrated under free field conditions at the MRP. The test signal level shall be -4,7 dBPa, measured at the MRP. The test signal level is averaged over the complete test signal sequence.

The headset terminal is set up as described in clause 5.1.

The tests are repeated 5 times, in conformance with Recommendation ITU-T P.380 [15]. The results are averaged (averaged value in dB, for each frequency).

The sending sensitivity shall be calculated for each of the 20 frequency bands given in table 1 of Recommendation ITU-T P.79 [5], bands 1 to 20. For the calculation, the averaged measured level at the electrical reference point for each frequency band is referred to the averaged test signal level measured in each frequency band at the MRP.

The sensitivity is expressed in terms of dBV/Pa and the SLR shall be calculated according to Recommendation ITU-T P.79 [5], see annex A.
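
For orientation, the general loudness rating formula of Recommendation ITU-T P.79 [5] can be sketched as below. The weighting factors are not reproduced here and have to be taken from table 1 of that Recommendation; the function names, the slope value m = 0,175 as default, and the reading of the requirement as a nominal value of 8 dB with a tolerance of 3 dB on either side are assumptions of this example.

```python
import math


def loudness_rating(sensitivities_db, weights_db, m=0.175):
    """LR = -(10/m) * log10( sum_i 10^((m/10) * (S_i - W_i)) ).

    sensitivities_db: per-band sensitivities S_i (here in dBV/Pa for sending).
    weights_db: per-band weighting factors W_i, supplied by the caller.
    """
    if len(sensitivities_db) != len(weights_db):
        raise ValueError("one weighting factor is needed per band")
    total = sum(
        10 ** (m / 10 * (s - w)) for s, w in zip(sensitivities_db, weights_db)
    )
    return -(10 / m) * math.log10(total)


def slr_within_tolerance(slr_db, nominal_db=8.0, tolerance_db=3.0):
    """True if the SLR lies within nominal_db +/- tolerance_db (assumed reading)."""
    return abs(slr_db - nominal_db) <= tolerance_db


# Single-band sanity check: S = 0 dB and W = 8 dB give a rating of 8 dB.
print(loudness_rating([0.0], [8.0]))
```

With the 20 bands of the measurement, `sensitivities_db` and `weights_db` would each hold 20 values in the same band order.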

6.1.3 Level dependency for SLR

Requirement:

For further study.

Measurement Method:

The loudness ratings are tested for different input levels (at least the nominal signal level, a level 10 dB lower and a level 5 dB higher).

The method is the same as for SLR, with the test signal level adapted.

6.1.4 Send Distortion 

6.1.4.1 Signal to harmonic distortion versus frequency

Requirement: 

The ratio of signal to harmonic distortion shall be above the following mask: 



Table 6.3: Send distortion for superwideband

Frequency | Ratio
100 Hz    | 24 dB
200 Hz    | 26 dB
400 Hz    | 30 dB
1 kHz     | 30 dB
2 kHz     | 30 dB
3,15 kHz  | 30 dB
5 kHz     | 30 dB

NOTE: Limits at intermediate frequencies lie on a straight line drawn between the given values on a linear (dB ratio) - logarithmic (frequency) scale.

Table 6.4: Send distortion for fullband

Frequency | Ratio
100 Hz    | 26 dB
200 Hz    | 30 dB
400 Hz    | 30 dB
1 kHz     | 30 dB
2 kHz     | 30 dB
3,15 kHz  | 30 dB
8 kHz     | 30 dB

NOTE: Limits at intermediate frequencies lie on a straight line drawn between the given values on a linear (dB ratio) - logarithmic (frequency) scale.



Measurement method: 

The headset terminal is set up as described in clause 5.1.

The signal used is an activation signal followed by a sine-wave signal with a frequency of 100 Hz, 315 Hz, 400 Hz, 500 Hz, 630 Hz, 800 Hz, 1 000 Hz, 2 000 Hz, 3 150 Hz and 7 000 Hz for superwideband and 100 Hz, 315 Hz, 400 Hz, 500 Hz, 630 Hz, 800 Hz, 1 000 Hz, 2 000 Hz, 3 150 Hz and 8 000 Hz for fullband. The duration of the sine wave shall be less than 1 s. The sinusoidal signal level shall be calibrated to -4,7 dBPa at the MRP.

The signal to harmonic distortion ratio is measured selectively up to 14 kHz for superwideband and 20 kHz for fullband.

The female speaker signal of the short conditioning sequence described in clause 7.3.7 of Recommendation ITU-T P.501 [1] shall be used for activation. The level of this activation signal is -4,7 dBPa at the MRP.

NOTE: Depending on the type of codec the test signal used may need to be adapted.
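
The present document does not spell out how the ratio is computed. A minimal sketch, assuming the common definition (level of the fundamental relative to the root-sum-square of the harmonics that fall inside the analysis range), is given below; the function names are hypothetical.

```python
import math


def harmonic_frequencies(fundamental_hz, upper_hz):
    """Harmonics 2f, 3f, ... that do not exceed the upper analysis frequency."""
    harmonics = []
    order = 2
    while order * fundamental_hz <= upper_hz:
        harmonics.append(order * fundamental_hz)
        order += 1
    return harmonics


def signal_to_harmonic_distortion_db(fundamental_rms, harmonic_rms):
    """20*log10(fundamental / sqrt(sum of squared harmonic RMS amplitudes))."""
    distortion = math.sqrt(sum(h * h for h in harmonic_rms))
    if distortion == 0.0:
        return math.inf
    return 20 * math.log10(fundamental_rms / distortion)


# For a 3 150 Hz tone analysed up to 14 kHz (superwideband), three harmonics count.
print(harmonic_frequencies(3150.0, 14000.0))
print(round(signal_to_harmonic_distortion_db(1.0, [0.03, 0.04]), 2))
```

The upper analysis frequency decides how many harmonics enter the sum, so the same tone can yield a different ratio for superwideband (14 kHz) and fullband (20 kHz).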

6.1.4.2 Signal to harmonic distortion for higher input level

Requirement:

For the signal defined in the measurement method, the signal to harmonic distortion ratio shall be > 30 dB.

Measurement method:

The headset terminal is set up as described in clause 5.1.

The signal used is an activation signal followed by a 1 kHz sine wave. The signal to harmonic distortion ratio is measured selectively up to 14 kHz for superwideband terminals and up to 20 kHz for fullband terminals.

The duration of the sine wave shall be < 1 s. The sinusoidal signal level shall be calibrated to +10 dBPa at the MRP.

The female speaker signal of the short conditioning sequence described in clause 7.3.7 of Recommendation ITU-T P.501 [1] shall be used for activation. The level of this activation signal is -4,7 dBPa at the MRP.

NOTE: Depending on the type of codec the test signal used may need to be adapted.

6.1.5 Send Noise 

Requirement: 

The maximum noise level produced by the VoIP terminal at the POI under silent conditions in the sending direction shall not exceed -68 dBm0(A).

No peaks in the frequency domain higher than 10 dB above the average noise spectrum shall occur.

Measurement Method:

For the actual measurement no test signal is used. In order to reliably activate the terminal, an activation signal is introduced before the actual measurement.

The female speaker signal of the short conditioning sequence described in clause 7.3.7 of Recommendation ITU-T P.501 [1] shall be used for activation. The level of this activation signal is -4,7 dBPa at the MRP.

The headset terminal is set up as described in clause 5.1.

The send noise is measured at the POI in the frequency range from 50 Hz to 14 kHz for superwideband and 20 Hz to 20 kHz for fullband. The analysis window is applied directly after stopping the activation signal, but taking into account the influence of all acoustical components (reverberations). The averaging time is 1 s. The test house has to ensure (e.g. by monitoring the time signal) that during the test the terminal remains in activated condition. If the terminal is deactivated during the measurement, the measurement time has to be reduced to the period where the terminal remains in activated condition.

The noise level is measured in dBm0(A).

6.2 Receive parameter 

6.2.1 Equalization 

This type of terminal may be used for the reproduction of signals other than pure speech (e.g. music), for which the user's preference may differ in terms of sound signature.

The terminal may therefore implement an equalization function that adjusts the frequency response according to the user's preferences.

When such a function is available, it is necessary that the receive frequency response conforms to the requirement of clause 6.2.2 of the present document for at least one setting.

For all settings, conformance with the other parameters of the present document shall be ensured.

6.2.2 Receive Frequency response 

Requirement: 

The receive frequency response of the headset shall be within a mask as defined in table 6.5 and shown in figure 6.3. 

NOTE 1: Because the measurement method defined (with the correction factor given in annex B) is, for the time being, only valid up to 16 kHz, the requirements for superwideband and for fullband are identical.



Table 6.5: Receive frequency response limits

Frequency | Upper Limit | Lower Limit
50 Hz     | 3 dB        | -5 dB
400 Hz    | 3 dB        | -5 dB
1 010 Hz  | *           | -5 dB
1 200 Hz  | *           | -8 dB
1 500 Hz  | *           | -8 dB
2 000 Hz  | 9 dB        | -3 dB
3 200 Hz  | 9 dB        | -3 dB
14 000 Hz | 9 dB        | -13 dB

NOTE: * The limit curves shall be determined by straight lines joining successive co-ordinates given in the table, where frequency response is plotted on a linear dB scale against frequency on a logarithmic scale. It is a floating or "best fit" mask. The requirement is based on 1/12th octave measurement.

Figure 6.3: Receive frequency response mask for superwideband

NOTE 2: This requirement applies to headphones not primarily designed for superwideband communication but rather for listening to music, which is the reason for the rather open limits. New limits will be discussed in the near future, to apply when headphones specially designed for superwideband become available.

Measurement Method: 

Receive frequency response is the ratio of the measured sound pressure to the input level (dB relative to 1 Pa/V):

$S_{Jeff} = 20 \log_{10} \left( p_{eff} / v_{RCV} \right)$ dB rel 1 Pa/V

where:

$S_{Jeff}$ is the receive sensitivity, junction to HATS ear, with free field correction;

$p_{eff}$ is the DRP sound pressure measured by the ear simulator; measurement data are converted from the Drum Reference Point to free field;

$v_{RCV}$ is the equivalent RMS input voltage.

The test signal to be used for the measurements is defined in clause 5.1.4.
The headset terminal is set up as described in clause 5.1.

The sound pressure level is measured at the DRP of the HATS for each 1/12th octave band.

The tests are repeated 5 times, in conformance with Recommendation ITU-T P.380 [15]. The results are averaged (averaged value in dB, for each frequency).

Measurements shall be made at one twelfth-octave intervals as given by the R.40 series of preferred numbers in ISO 3 [20] for frequencies from 50 Hz to 16 kHz inclusive. For the calculation, the averaged measured level at each frequency band is referred to the averaged test signal level measured in each frequency band.

The obtained response curve is corrected by the correction factor given in annex B.

The sensitivity is expressed in terms of dBPa/V.
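
Because table 6.5 defines a floating ("best fit") mask, conformance is a question of whether some vertical shift of the mask contains the whole corrected response. One way of checking this is sketched below; the function names are hypothetical, the response is assumed to be already corrected as described above, and the rows marked "*" are left out of the upper limit because the straight line between the neighbouring corner points defines the limit there.

```python
import math

# Corner points (Hz, dB) transcribed from table 6.5.
RCV_UPPER = [(50, 3), (400, 3), (2000, 9), (3200, 9), (14000, 9)]
RCV_LOWER = [(50, -5), (400, -5), (1010, -5), (1200, -8), (1500, -8),
             (2000, -3), (3200, -3), (14000, -13)]


def limit_at(points, freq_hz):
    """Interpolate a limit on a linear-dB versus log-frequency scale."""
    for (f_lo, db_lo), (f_hi, db_hi) in zip(points, points[1:]):
        if f_lo <= freq_hz <= f_hi:
            frac = math.log10(freq_hz / f_lo) / math.log10(f_hi / f_lo)
            return db_lo + frac * (db_hi - db_lo)
    raise ValueError("frequency outside the range covered by the mask")


def floating_mask_offsets(response_db, upper=RCV_UPPER, lower=RCV_LOWER):
    """Range of vertical mask shifts (dB) that contain the response, or None.

    response_db maps frequency in Hz to the corrected sensitivity in dB.
    A shift c fits if lower + c <= response <= upper + c at every frequency.
    """
    smallest = max(level - limit_at(upper, f) for f, level in response_db.items())
    largest = min(level - limit_at(lower, f) for f, level in response_db.items())
    return (smallest, largest) if smallest <= largest else None


flat = {f: 0.0 for f in (50, 100, 400, 1000, 2000, 3200, 8000, 14000)}
print(floating_mask_offsets(flat))
```

A response passes when the function returns a range rather than `None`; the width of that range indicates how much margin is left.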

6.2.3 Receive Loudness Rating (monaural reproduction)

Requirement: 

When the terminal implements wideband speech functions or when the superwideband/fullband functions may interact with wideband terminals, the headset terminal shall fulfil the requirements on RLR as defined in ES 202 739 [12], clause 7.1.7.

Measurement Method:

The test signal to be used for the measurements shall be the British-English single talk sequence described in clause 7.3.2 of Recommendation ITU-T P.501 [1]. The test signal level shall be -16 dBm0, measured at the digital reference point or the equivalent analogue point. The test signal level is averaged over the complete test signal sequence.

The headset terminal is set up as described in clause 5.1.

The tests are repeated 5 times, in conformance with Recommendation ITU-T P.380 [15]. The results are averaged (averaged value in dB, for each frequency).

The receiving sensitivity shall be calculated for each of the 20 frequency bands given in table 1 of Recommendation ITU-T P.79 [5], bands 1 to 20. For the calculation, the averaged measured level at each frequency band is referred to the averaged test signal level measured in each frequency band.

The sensitivity is expressed in terms of dBPa/V and the RLR shall be calculated according to Recommendation ITU-T P.79 [5], see annex A. No leakage correction shall be applied for the measurement.

6.2.4 RLR for stereo/dichotic 

For further study. 

6.2.5 Loudness 

For further study. 

6.2.6 Receive Distortion 

Requirement: 

The ratio of signal to harmonic distortion shall be above the following mask: 



Table 6.6: Receive distortion for superwideband

Frequency | Signal to distortion ratio limit, receiving
100 Hz    | 24 dB
315 Hz    | 26 dB
400 Hz    | 30 dB
500 Hz    | 30 dB
800 Hz    | 30 dB
1 kHz     | 30 dB
2 kHz     | 30 dB
3,15 kHz  | 30 dB
5 kHz     | 30 dB

NOTE: Limits at intermediate frequencies lie on a straight line drawn between the given values on a linear (dB ratio) - logarithmic (frequency) scale.



ETSI TS 102 924 V1.1.1 (2013-03) 



Table 6.7: Receive distortion for fullband

Frequency | Signal to distortion ratio limit, receiving
50 Hz     | 24 dB
100 Hz    | 26 dB
315 Hz    | 30 dB
400 Hz    | 30 dB
500 Hz    | 30 dB
800 Hz    | 30 dB
1 kHz     | 30 dB
2 kHz     | 30 dB
3,15 kHz  | 30 dB
8 kHz     | 30 dB

NOTE: Limits at intermediate frequencies lie on a straight line drawn between the given values on a linear (dB ratio) - logarithmic (frequency) scale.



Measurement Method: 

The headset terminal is positioned as described in clause 5.1.

The signal used is an activation signal followed by a sine-wave signal with a frequency of 100 Hz, 315 Hz, 400 Hz, 500 Hz, 630 Hz, 800 Hz, 1 000 Hz, 2 000 Hz, 3 150 Hz and 7 000 Hz for superwideband and 50 Hz, 100 Hz, 315 Hz, 400 Hz, 500 Hz, 630 Hz, 800 Hz, 1 000 Hz, 2 000 Hz, 3 150 Hz and 8 000 Hz for fullband.

The female speaker signal of the short conditioning sequence described in clause 7.3.7 of Recommendation ITU-T P.501 [1] shall be used for activation. The signal level shall be -16 dBm0.

The signal to harmonic distortion ratio is measured selectively up to 14 kHz for superwideband and 20 kHz for fullband.

The ratio of signal to harmonic distortion shall be measured at the DRP of the artificial ear with a correction by the curve of the reference microphone.

NOTE: Depending on the type of codec the test signal used may need to be adapted.

6.2.7 Minimum activation level and sensitivity in Receive direction 

For further study. 

6.2.8 Receive Noise 

Requirement: 

Telephone sets with adjustable receive levels shall be adjusted so that the RLR is as close as possible to the nominal 
RLR. 

The receive noise shall be less than -57 dBPa(A). 

Where a volume control is provided, the measured noise shall not be greater than -54 dBPa(A) at the maximum setting 
of the volume control. 

Measurement Method: 

The headset terminal is set up as described in clause 5.1.

The female speaker signal of the short conditioning sequence described in clause 7.3.7 of Recommendation ITU-T P.501 [1] shall be used for activation. The activation signal level shall be -16 dBm0.

The measurement shall be A-weighted.

The noise shall be measured at the DRP of the artificial ear with a correction by the curve of the reference microphone.
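
As a hedged illustration of the A-weighting step, the sketch below uses the analytic A-weighting curve commonly given with IEC 61672-1 and sums band levels energetically. The present document does not prescribe this implementation; the function names and the band-level input format are assumptions.

```python
import math


def a_weighting_db(freq_hz):
    """A-weighting gain in dB at freq_hz (approximately 0 dB at 1 kHz)."""
    f2 = freq_hz ** 2
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20 * math.log10(ra) + 2.00


def a_weighted_level_db(band_levels_db):
    """Energetic sum of band levels (Hz -> dBPa) after A-weighting, in dBPa(A)."""
    total = sum(
        10 ** ((level + a_weighting_db(freq)) / 10)
        for freq, level in band_levels_db.items()
    )
    return 10 * math.log10(total)


# Hypothetical noise spectrum in three bands, compared with the -57 dBPa(A) limit.
noise = {250.0: -62.0, 1000.0: -66.0, 4000.0: -70.0}
level = a_weighted_level_db(noise)
print(round(level, 1), level < -57.0)
```

The same weighting function could be applied to a send noise spectrum; only the unit of the input levels changes.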

6.2.9 Automatic Gain Control in Receiving

For further study. 

6.3 Other parameters 

6.3.1 Sidetone Masking Rating STMR (Mouth to ear) 

Requirement: 

The STMR shall be 16 dB ± 4 dB for nominal setting of the volume control.

For all other positions of the volume control, the STMR shall not be below 8 dB. 

NOTE 1: It is preferable to have a constant STMR independent of the volume control setting.

NOTE 2: STMR measurement in Recommendation ITU-T P.79 [5] is not defined above 8 kHz, but the sidetone signal is not supposed to have such a limitation.

Measurement Method:

The test signal is defined in clause 5.1.4. The test signal level shall be -4,7 dBPa, measured at the MRP. The headset terminal is set up as described in clause 5.1.

Where a user operated volume control is provided, the measurements shall be carried out at the nominal setting of the volume control. In addition, the measurement is repeated at the maximum volume control setting.

Measurements shall be made at one twelfth-octave intervals as given by the R.40 series of preferred numbers in ISO 3 [20] for frequencies from 100 Hz to 8 kHz inclusive. For the calculation, the averaged measured level at each frequency band (Recommendation ITU-T P.79 [5] table 3, bands 1 to 20) is referred to the averaged test signal level measured in each frequency band.

The sidetone path loss (LmeST), as expressed in dB, and the SideTone Masking Rating (STMR) (in dB) shall be calculated from the formula 5-1 of Recommendation ITU-T P.79 [5], using m = 0,225 and the weighting factors in table 3 of Recommendation ITU-T P.79 [5].

6.3.2 Sidetone delay 

Requirement: 

The maximum sidetone-round-trip delay shall be < 5 ms, measured in an echo-free setup. 
Measurement Method: 

The headset terminal is set up as described in clause 5.1.

The test signal is a CS-signal complying with Recommendation ITU-T P.501 [1] using a PN sequence with a length of 4 096 points (for the 48 kHz sampling rate), which equals the period T. The duration of the complete test signal is as specified in Recommendation ITU-T P.501 [1].

The level of the signal shall be -4,7 dBPa at the MRP. 
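
For orientation, the evaluation specified in this clause (cross-correlation of the test signal with the signal at the artificial ear over one period T, then the time difference between the direct-sound maximum and the sidetone maximum) can be sketched as follows. This is a simplified illustration: the maxima are picked on the magnitude of the cross-correlation instead of its Hilbert envelope, a short random sequence stands in for the PN sequence, and all names are hypothetical.

```python
import random

FS_HZ = 48000                              # sampling rate named in the text
PN_POINTS = 4096                           # PN sequence length named in the text
PERIOD_MS = 1000.0 * PN_POINTS / FS_HZ     # period T, roughly 85.3 ms


def circular_xcorr(x, y):
    """Phi_xy(tau) = (1/T) * sum_t x[t] * y[(t + tau) mod T] for tau = 0..T-1."""
    n = len(x)
    return [sum(x[t] * y[(t + tau) % n] for t in range(n)) / n
            for tau in range(n)]


def two_strongest_lags(phi):
    """Lags of the two largest local maxima of |phi|, in ascending lag order."""
    n = len(phi)
    mag = [abs(v) for v in phi]
    peaks = [k for k in range(n)
             if mag[k] > mag[k - 1] and mag[k] >= mag[(k + 1) % n]]
    strongest = sorted(peaks, key=lambda k: mag[k], reverse=True)[:2]
    return tuple(sorted(strongest))


def lag_to_ms(lag_samples, fs_hz=FS_HZ):
    """Convert a lag in samples to milliseconds."""
    return 1000.0 * lag_samples / fs_hz


# Toy signals: direct sound after 10 samples, sidetone after 130 samples.
rng = random.Random(0)
n = 512
x = [rng.choice((-1.0, 1.0)) for _ in range(n)]
y = [x[(t - 10) % n] + 0.5 * x[(t - 130) % n] for t in range(n)]

direct_lag, sidetone_lag = two_strongest_lags(circular_xcorr(x, y))
sidetone_delay_ms = lag_to_ms(sidetone_lag - direct_lag)
print(direct_lag, sidetone_lag, sidetone_delay_ms, sidetone_delay_ms < 5.0)
```

With the real test signal the correlation is band-limited, which is where the envelope becomes useful: it removes the oscillation of the correlation function so that each arrival shows up as a single maximum.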

The cross-correlation function Φxy(τ) between the input signal Sx(t) generated by the test system in send direction and the output signal Sy(t) measured at the artificial ear is calculated in the time domain:

$$\Phi_{xy}(\tau) = \frac{1}{T} \sum_{t=-T/2}^{T/2} S_x(t) \cdot S_y(t+\tau) \qquad (1)$$

The measurement window T shall be exactly identical with the time period T of the test signal; the measurement window is positioned to the PN sequence of the test signal.

The sidetone delay is calculated from the envelope E(τ) of the cross-correlation function Φxy(τ). The first maximum of the envelope function occurs in correspondence with the direct sound produced by the artificial mouth, the second one occurs with a possible delayed sidetone signal. The difference between the two maxima corresponds to the sidetone delay. The envelope E(τ) is calculated by the Hilbert transformation H{xy(τ)} of the cross-correlation:

$$H\{xy(\tau)\} = \sum_{u=-T/2}^{T/2} \frac{\Phi_{xy}(u)}{\pi (\tau - u)} \qquad (2)$$

$$E(\tau) = \sqrt{\left[\Phi_{xy}(\tau)\right]^2 + \left[H\{xy(\tau)\}\right]^2} \qquad (3)$$

It is assumed that the measured sidetone delay is less than T/2.

6.3.3 Stability loss

Requirement:

With the headset lying on a hard surface and the transducers facing that surface, the attenuation from the digital input to the digital output shall be at least 6 dB at all frequencies in the range of 50 Hz to 16 kHz for superwideband and 20 Hz to 20 kHz for fullband. The requirement applies for the closest possible position between microphone and headset receiver.

NOTE: Depending on the type of headset it may be necessary to repeat the measurement in different positions.

Measurement method:

Before the actual test, a training sequence consisting of the British-English single talk sequence described in clause 7.3.2 of Recommendation ITU-T P.501 [1] is applied. The training sequence level shall be -16 dBm0 in order not to overload the codec.

The test signal is a PN sequence complying with Recommendation ITU-T P.501 [1] with a length of 4 096 points (for the 48 kHz sampling rate) and a crest factor of 6 dB. The duration of the test signal is 250 ms. With an input signal of -3 dBm0, the attenuation from digital input to digital output shall be measured for frequencies from 100 Hz to 8 kHz under the following conditions:

a) the headset, with the transmission circuit fully active, shall be positioned on one inside surface of three perpendicular plane, smooth, hard surfaces forming a corner. Each surface shall extend 0,5 m from the apex of the corner. One surface shall be marked with a diagonal line, extending from the corner formed by the three surfaces, and a reference position 250 mm from the corner, as shown in figure 6.4;

b) the headset, with the transmission circuit fully active, shall be positioned on the defined surface as follows:

1) the microphone and the receiver shall face towards the surface;

2) the headset receiver shall be placed centrally at the reference point as shown in figure 6.4;

3) the headset microphone is positioned as close as possible to the receiver.

NOTE: All dimensions in mm.

Figure 6.4



6.3.4 Round-trip Delay 

The round trip delay includes send delay plus receive delay. A maximum acceptable value has to be defined in the next 
release of the present document. 

Codec delay is not included. 

For conversational services, the delay shall be kept as small as possible to ensure a good quality and in particular a high 
level of interactivity between the users. 

6.3.5 Terminal Echo Loss 

Requirement: 

The TCL measured as unweighted Echo Loss shall be > 46 dB for all positions of the volume control (if supplied). 
Measurement method: 

The headset terminal is set up as described in clause 5.1. The ambient noise level shall be < -64 dBPa(A). The attenuation from electrical reference point input to electrical reference point output shall be measured using the compressed real speech signal described in clause 7.3.3 of Recommendation ITU-T P.501 Amendment 1 [1].

TCL is calculated as unweighted echo loss from 100 Hz to 8 kHz. For the calculation, the averaged measured echo level at each frequency band is referred to the averaged test signal level measured in each frequency band. The first 17,0 s of the test signal (6 sentences) are discarded from the analysis to allow for convergence of the acoustic echo canceller. The analysis is performed over the remaining length of the test sequence (last 6 sentences).
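
The present document does not give the formula for the unweighted echo loss. The sketch below assumes the trapezoidal rule known from Recommendation ITU-T G.122, generalised from the narrowband range to 100 Hz - 8 kHz, i.e. a power average of the per-band loss over a logarithmic frequency axis. The function name and input layout are hypothetical.

```python
import math


def unweighted_echo_loss_db(freqs_hz, loss_db):
    """Echo loss as a power average over log-frequency (trapezoidal rule).

    freqs_hz: ascending band frequencies covering the analysis range.
    loss_db: loss per band in dB (test signal level minus echo level).
    """
    if len(freqs_hz) != len(loss_db) or len(freqs_hz) < 2:
        raise ValueError("need matching lists with at least two bands")
    acc = 0.0
    for i in range(1, len(freqs_hz)):
        a_prev = 10 ** (-loss_db[i - 1] / 10)   # power transfer ratio, band i-1
        a_curr = 10 ** (-loss_db[i] / 10)       # power transfer ratio, band i
        width = math.log10(freqs_hz[i]) - math.log10(freqs_hz[i - 1])
        acc += 0.5 * (a_prev + a_curr) * width
    span = math.log10(freqs_hz[-1]) - math.log10(freqs_hz[0])
    return -10 * math.log10(acc / span)


# Hypothetical per-band losses; the result is compared with the 46 dB limit.
freqs = [100.0, 400.0, 1000.0, 4000.0, 8000.0]
losses = [55.0, 50.0, 48.0, 52.0, 60.0]
tcl = unweighted_echo_loss_db(freqs, losses)
print(round(tcl, 1), tcl > 46.0)
```

Because the average is taken in the power domain, bands with low loss dominate the result, so a single weak band can pull the TCL below the limit even if most bands are well above it.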

6.3.6 Objective listening quality 

For further study. Recommendation ITU-T P.863 [14] "describes an objective method for predicting overall listening speech quality from narrowband (300 Hz to 3 400 Hz) to superwideband (50 Hz to 14 000 Hz) telecommunication scenarios as perceived by the user in an Recommendation ITU-T P.800 [17] or Recommendation ITU-T P.830 [18] ACR listening only test".

NOTE: Particular attention has to be taken for the choice of appropriate test sequence. 

6.3.7 Double talk performance

To assess double talk performance, the signals to be used are defined in Recommendation ITU-T P.501 [1]. A "double-talk" sequence representing typical double talk scenarios in real conversations is shown in figure 6.5. This uses the single-talk sequence described in section 7.3.1 of Recommendation ITU-T P.501 [1], shown in the lower pane, as the main speech and an additional competing speaker sequence, shown in the upper pane.



[Figure: double-talk sequence. Upper pane: competing speech (speakers F3, M2, F1, M4) with its functions marked above it; lower pane: single-talk sequence; cross-hatched areas between the panes: double talk; horizontal axis: Time (s).]

Figure 6.5: Double-talk test sequence using the single-talk sequence and competing speech serving different functions (a - e). Cross-hatched areas between the upper and lower panes show periods of double talk.

The competing-speaker sequence includes single words (the word "five") spoken by speakers F3 and M2 during the first half of the sequence, followed by full sentences by speakers F1 and M4 during the second half of the sequence. No speaker is competing with themselves during the sequence.

The competing samples serve different double-talk functions, defined as functions "a" to "e" above the upper pane of 
figure 6.5. The functions are: 

a) competing word within a speech pause; 

b) competing word partially masked; 

c) competing word fully masked within a sentence; 

d) competing word fully masked coincident with the start of a sentence; 

e) sentence masking another sentence. 

These are meant to represent possible double-talk situations in normal conversation. The area between the upper and 
lower pane of figure 6.5 shows the periods during which double-talk happens as cross-hatched patches. The competing 
sequence can be used either as a send signal or a receive signal in testing. 



6.3.8 Speech and audio quality in presence of noise 

For further study. 

Annex A (normative):

Correction factor used for measuring superwideband headset using HATS

A correction factor has to be added to the acoustic level measured at the DRP of the 3.3 or 3.4 ear.
This correction factor is given for 1/3rd octave measurements and for 1/12th octave measurements.
For 1/12th octave measurements, two tables are given, corresponding to two sets of frequencies.

The first set consists of the 1/12th octave centre frequencies specified in IEC 61260 [16]; the second set consists of the centre frequencies corresponding to 1/12th octave intervals as given by the R.40 series of preferred numbers in ISO 3 [20].



A.1 1/3 octave correction

Table A.1: 1/3 octave correction

Freq. Hz | Rep. dB | Freq. Hz | Rep. dB | Freq. Hz | Rep. dB | Freq. Hz | Rep. dB
16       |         | 100      |         | 630      | 2       | 4 000    | 12,66
20       |         | 125      |         | 800      | 4       | 5 000    | 9,94
25       |         | 160      |         | 1 000    | 5       | 6 300    | 5,59
31       |         | 200      |         | 1 250    | 6,5     | 8 000    | 8,77
40       |         | 250      | 0,5     | 1 600    | 8       | 10 000   | 6,53
50       |         | 315      | 0,5     | 2 000    | 11,78   | 12 500   | -0,20
63       |         | 400      | 1       | 2 500    | 14,06   | 16 000   | -0,52
80       |         | 500      | 1,5     | 3 150    | 13,44   |          |







A.2 1/12th octave correction

Table A.2: 1/12th octave correction using centre frequencies specified in IEC 61260 [16]

Freq. Hz | Rep. dB | Freq. Hz | Rep. dB | Freq. Hz | Rep. dB | Freq. Hz | Rep. dB | Freq. Hz | Rep. dB | Freq. Hz | Rep. dB
19       |         | 61,3     |         | 193      |         | 613      | 1,94    | 1 939    | 11,17   | 6 131    | 5,5
20,5     |         | 64,9     |         | 205      | 0,06    | 649      | 2,25    | 2 053    | 12,06   | 6 494    | 4,95
21,8     |         | 68,8     |         | 218      | 0,19    | 688      | 2,74    | 2 175    | 12,91   | 6 879    | 4,9
23       |         | 72,9     |         | 230      | 0,31    | 729      | 3,22    | 2 304    | 13,61   | 7 286    | 5,35
24,4     |         | 77,2     |         | 244      | 0,45    | 772      | 3,7     | 2 441    | 14,05   | 7 718    | 6,57
25,9     |         | 81,8     |         | 259      | 0,5     | 818      | 4,1     | 2 585    | 14,18   | 8 175    | 9,26
27,4     |         | 86,6     |         | 274      | 0,5     | 866      | 4,36    | 2 738    | 14,07   | 8 659    | 11,14
29       |         | 92       |         | 290      | 0,5     | 917      | 4,61    | 2 901    | 13,79   | 9 170    | 9,55
30,7     |         | 97       |         | 307      | 0,5     | 972      | 4,87    | 3 073    | 13,46   | 9 720    | 6,13
32,5     |         | 103      |         | 325      | 0,57    | 1 029    | 5,19    | 3 255    | 13,18   | 10 290   | 3,45
34,5     |         | 109      |         | 345      | 0,69    | 1 090    | 5,58    | 3 447    | 12,97   | 10 900   | 1,75
36,5     |         | 115      |         | 365      | 0,81    | 1 155    | 5,97    | 3 652    | 12,84   | 11 550   | 0,56
38,7     |         | 122      |         | 387      | 0,93    | 1 223    | 6,35    | 3 868    | 12,73   | 12 230   | -0,46
41       |         | 130      |         | 410      | 1,06    | 1 296    | 6,72    | 4 097    | 12,55   | 12 960   | -1,3
43,4     |         | 137      |         | 434      | 1,18    | 1 372    | 7,07    | 4 340    | 12,16   | 13 720   | -1,44
46       |         | 145      |         | 460      | 1,31    | 1 454    | 7,42    | 4 597    | 11,4    | 14 540   | 0,87
48,7     |         | 154      |         | 487      | 1,44    | 1 540    | 7,77    | 4 870    | 10,31   | 15 400   | 2,71
51,6     |         | 163      |         | 516      | 1,57    | 1 631    | 8,55    | 5 158    | 9,02    |          |
54,6     |         | 173      |         | 546      | 1,69    | 1 728    | 9,39    | 5 464    | 7,67    |          |
57,9     |         | 183      |         | 579      | 1,82    | 1 830    | 10,23   | 5 788    | 6,45    |          |

Table A.3: 1/12th octave correction using centre frequencies corresponding to 1/12th octave intervals as given by the R.40 series of preferred numbers in ISO 3 [20]

Freq. Hz | Rep. dB | Freq. Hz | Rep. dB | Freq. Hz | Rep. dB | Freq. Hz | Rep. dB | Freq. Hz | Rep. dB | Freq. Hz | Rep. dB
19       |         | 60       |         | 190      |         | 600      | 1,89    | 1 900    | 10,84   | 6 000    | 5,86
20       |         | 63       |         | 200      | 0,03    | 630      | 2,09    | 2 000    | 11,65   | 6 300    | 5,24
21,2     |         | 67       |         | 212      | 0,13    | 670      | 2,52    | 2 120    | 12,54   | 6 700    | 4,92
22,4     |         | 71       |         | 224      | 0,25    | 710      | 3,00    | 2 240    | 13,27   | 7 100    | 5,15
23,6     |         | 75       |         | 236      | 0,37    | 750      | 3,46    | 2 360    | 13,79   | 7 500    | 5,96
25       |         | 80       |         | 250      | 0,47    | 800      | 3,95    | 2 500    | 14,10   | 8 000    | 8,25
26,5     |         | 85       |         | 265      | 0,50    | 850      | 4,27    | 2 650    | 14,14   | 8 500    | 10,54
28       |         | 90       |         | 280      | 0,50    | 900      | 4,53    | 2 800    | 13,96   | 9 000    | 10,07
30       |         | 95       |         | 300      | 0,50    | 950      | 4,77    | 3 000    | 13,60   | 9 500    | 7,47
31,5     |         | 100      |         | 315      | 0,53    | 1 000    | 5,03    | 3 150    | 13,34   | 10 000   | 4,79
33,5     |         | 106      |         | 335      | 0,63    | 1 060    | 5,39    | 3 350    | 13,08   | 10 600   | 2,57
35,5     |         | 112      |         | 355      | 0,75    | 1 120    | 5,76    | 3 550    | 12,90   | 11 200   | 1,19
37,5     |         | 118      |         | 375      | 0,86    | 1 180    | 6,11    | 3 750    | 12,79   | 11 800   | 0,18
40       |         | 125      |         | 400      | 1,00    | 1 250    | 6,49    | 4 000    | 12,63   | 12 500   | -0,77
42,5     |         | 132      |         | 425      | 1,14    | 1 320    | 6,83    | 4 250    | 12,30   | 13 200   | -1,34
45       |         | 140      |         | 450      | 1,26    | 1 400    | 7,19    | 4 500    | 11,68   | 14 000   | -0,64
47,5     |         | 150      |         | 475      | 1,39    | 1 500    | 7,61    | 4 750    | 10,79   | 15 000   | 1,87
50       |         | 160      |         | 500      | 1,50    | 1 600    | 8,29    | 5 000    | 9,72    |          |
53       |         | 170      |         | 530      | 1,63    | 1 700    | 9,15    | 5 300    | 8,38    |          |
56       |         | 180      |         | 560      | 1,75    | 1 800    | 9,9904  | 5 600    | 7,15    |          |